Hierarchical relational models for document networks
Similar resources
Hierarchical Relational Models for Document Networks
We develop the relational topic model (RTM), a hierarchical model of both network structure and node attributes. We focus on document networks, where the attributes of each document are its words, i.e., discrete observations taken from a fixed vocabulary. For each pair of documents, the RTM models their link as a binary random variable that is conditioned on their contents. The model can be use...
Relational Topic Models for Document Networks
We develop the relational topic model (RTM), a model of documents and the links between them. For each pair of documents, the RTM models their link as a binary random variable that is conditioned on their contents. The model can be used to summarize a network of documents, predict links between them, and predict words within them. We derive efficient inference and learning algorithms based on v...
Sparse Relational Topic Models for Document Networks
Learning latent representations is playing a pivotal role in machine learning and many application areas. Previous work on the relational topic model (RTM) has shown promise in learning latent topical representations for describing relational document networks and predicting pairwise links. However, under a probabilistic formulation with normalization constraints, RTM could be ineffective in con...
Relational Data Model in Document Hierarchical Indexing
One of the problems of the development of document indexing and retrieval applications is the usage of hierarchies. In this paper we describe a method of automatic hierarchical indexing using the traditional relational data model. The main idea is to assign continuous numbers to the words (grammatical forms of the words) that characterize the nodes in the hierarchy (concept tree). One of the ad...
Marginally Specified Hierarchical Models for Relational Data
We present a unified approach to modelling dyadic relational data, namely that seen in social, biological and technological networks, without restriction to the binary format. The approach involves three principles: considering the marginal specification of any edge as the fundamental unit, embedding as much dependence as possible in latent structural forms, and using distributional forms that ...
Journal
Journal title: The Annals of Applied Statistics
Year: 2010
ISSN: 1932-6157
DOI: 10.1214/09-aoas309